WHEN ERRORS ARE PRESENT IN INDEPENDENT VARIABLES

L. S. Gurin 

Space Research Institute, USSR Academy of Sciences 
Submitted by Deputy Director V. M. Balebanov


Algorithms are examined for estimating the
parameters of a regression model when there are
errors in the independent variables. The algorithms
are fast, and the estimates they yield are stable
with respect to correlation of the measurement
errors in both the dependent variable and the
independent variables.


1. Formulation of the Problem 

We consider a model of a system (or physical phenomenon)
of the form:

    $y = f(x, \theta)$,    (1)


where $x \in R^m$ is the vector of the input quantities
(independent variables or regressors); $\theta \in R^k$ is the
vector of the parameters being evaluated; and $y$ is the output
quantity (dependent variable).

The literature also discusses the case when there are
errors in the model, i.e. in reality:

    $y = f(x, \theta) + \xi'$,    (2)


♦Numbers in the margin indicate pagination in the foreign text. 


where $\xi'$ is a random quantity, with a certain law of
distribution, that allows for the joint influence of many
factors. But this case does not deal with an experiment and its
errors. Hereafter we shall not discuss this case, assuming (as
do the majority of authors) that the model error $\xi'$ can be
included in the measurement error of the quantity $y$.

As a result of the experiment, n sets of measured values
are produced:

    $(x_i, y_i)$;  $i = 1, 2, \ldots, n$.    (3)

We must find an estimate of the parameter $\theta$, i.e.

    $\hat{\theta}_n = \varphi(x_1, y_1, \ldots, x_n, y_n)$.    (4)

The literature concerning the evaluation, numbering in the
thousands of works, owes its abundance to the fact that the
formulation of the problem can be refined in various ways,
depending on the assumptions made as to the type of the function
$f$, the class of algorithms $\varphi$ among which the optimal
one is chosen, the initial data as to the parameters and random
factors, and the criteria by which the algorithms are compared.

The characteristic features of the provisional effective
evaluation are the following:

a) only finite algorithms are considered as candidates
for $\varphi$;

b) the evaluations are compared in terms of not one, but
several criteria;

c) one of the criteria of comparison of the estimates is
the difficulty of realization (complexity) of the algorithm;

d) the investigation favors the results of statistical
modeling, since for finite algorithms and finite samples the
asymptotic properties are not sufficient to make decisions as
to the quality of the estimates.

The concept of a provisional effective evaluation was
introduced in [1], which considered the difficulty of the
algorithms along with precision criteria. A similar concept
of a provisional optimal evaluation is examined in [2] for the
solution of dynamic problems. Hereafter we shall add stability
(robustness) in the sense of [3,4] to the quality criteria of
the estimates. The corresponding provisional effective
estimates have been considered in [5]. In [6] both robustness
in the above sense and robustness (guarantee) in the sense of
[7] are considered.

A general survey of the problem of provisional effective
estimation as a multicriteria problem of operations research
is given in [8]. In all the work on provisional effective
estimation to date, it is assumed that the independent
variables are measured without error.

As concerns the allowance for errors in the independent
variables, more than a hundred articles have already been devoted
to this, as well as sections in monographs (e.g. [9-16]). With
very few exceptions (e.g. [17]), the errors in both the dependent
and the independent variables in these works are assumed to be
uncorrelated between different measurements (although they may
be correlated between the individual components of the input
vector $x$ [18]).

The aim of the present work is to obtain provisional effective
estimates while allowing for correlated errors in the
independent variables. Thus, we first require very simple
estimates and, in the second place, we must allow for errors
(even correlated ones) in the independent variables. Thirdly,
the estimates should be robust with regard to the correlation
of errors, i.e. in the sense of [7]. Let us note at the outset,
to avoid later repetition, that the proposed estimates can
also be made robust in the sense of [3,4] by including them in
the two-level scheme of [5]. It is natural that, as in the
general case of solving multicriteria problems, the obtained
estimates will not be optimal when examined for each individual
criterion.


2. Examination of the Elementary Model 

Let us examine an elementary linear model:

    $y = \theta x$;  $m = 1$;  $k = 1$.    (5)


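With regard to the errors $\xi_i$ (in $y_i$) and $\eta_i$
(in $x_i$) of the measurements (3) we make the following
assumptions:

    $E\xi_i = E\eta_i = 0$;  $\mathrm{cov}(\xi_i, \eta_j) = 0$;  $i, j = 1, 2, \ldots, n$.    (6)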



We introduce a new quantity:

    $z_i = y_i - \theta x_i$.    (7)

We then have:

    $Ez_i = 0$;  $\mathrm{cov}(z_i, z_j) = \mathrm{cov}(\xi_i, \xi_j) + \theta^2 \mathrm{cov}(\eta_i, \eta_j)$    (8)

(following [17] we could also allow for a correlation of $\xi$
and $\eta$, but we shall simplify the model even further for
greater clarity in the later arguments).


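According to [17], the estimate $\hat{\theta}$ is found by
minimization of the quadratic form:

    $U = z' \Sigma_z^{-1} z$,    (9)

where $z = (z_1, \ldots, z_n)'$ and
$\Sigma_z = \|\mathrm{cov}(z_i, z_j)\|$.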

We can use various methods of minimization, but in all
cases a difficult algorithm results. In [17] the whole reduces
to the solution of algebraic equations of high degree, while
the coefficients of these equations, in turn, are found by
computations with matrices of high order.


Since a simple algorithm is required for the provisional
effective evaluation, we shall use the following approach. We
denote the matrices:

    $\Sigma_y = \|\mathrm{cov}(\xi_i, \xi_j)\|$;  $\Sigma_x = \|\mathrm{cov}(\eta_i, \eta_j)\|$.    (10)

Then (8) gives:

    $\Sigma_z = \Sigma_y + \theta^2 \Sigma_x$.    (11)

We shall regard $\theta^2 \Sigma_x$ as small in comparison with
$\Sigma_y$. We note that this condition can be met in more than
one way. It is fulfilled if the errors $\eta$ are small in
relation to the errors $\xi$, which happens rather often (this
explains the usually adopted assumption of their equaling
zero). Along with this we may also consider the case when the
errors $\xi$ and $\eta$ are comparable, but the parameter
$\theta$ is small. Each case may be reduced to the other by an
appropriate change of scale. Therefore, in what follows, we
shall simply regard $\theta$ as small.

Under this assumption we have ($I$ is the unit matrix):

    $\Sigma_z^{-1} \approx (I - \theta^2 \Sigma_y^{-1} \Sigma_x) \Sigma_y^{-1}$.    (12)
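The quality of this first-order inversion can be checked numerically. The following is a minimal sketch, assuming that $\Sigma_z = \Sigma_y + \theta^2 \Sigma_x$ and that (12) has the form $(I - \theta^2 \Sigma_y^{-1} \Sigma_x) \Sigma_y^{-1}$; the 2x2 restriction and all function names are illustrative assumptions and do not come from the original text.

```python
def inverse_2x2(m):
    """Exact inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


def matmul(p, q):
    """Product of two small matrices given as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]


def exact_inverse(sigma_y, sigma_x, theta):
    """Inverse of Sigma_z = Sigma_y + theta^2 * Sigma_x (assumed form)."""
    sigma_z = [[sigma_y[i][j] + theta ** 2 * sigma_x[i][j]
                for j in range(2)] for i in range(2)]
    return inverse_2x2(sigma_z)


def first_order_inverse(sigma_y, sigma_x, theta):
    """Approximation (I - theta^2 Sigma_y^-1 Sigma_x) Sigma_y^-1."""
    inv_y = inverse_2x2(sigma_y)
    correction = matmul(inv_y, sigma_x)
    factor = [[(1.0 if i == j else 0.0) - theta ** 2 * correction[i][j]
               for j in range(2)] for i in range(2)]
    return matmul(factor, inv_y)


def max_abs_difference(p, q):
    """Largest elementwise discrepancy between two 2x2 matrices."""
    return max(abs(p[i][j] - q[i][j]) for i in range(2) for j in range(2))
```

The discrepancy between the two inverses is of the order of $\theta^4$, so it falls off quickly as $\theta$ decreases, which is the sense in which $\theta$ must be "small".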

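We introduce the designations:

    $\Sigma_y^{-1} = \|a_{ij}\|$;  $\Sigma_y^{-1} \Sigma_x \Sigma_y^{-1} = \|b_{ij}\|$.    (13)

Then (9) is rewritten as:

    $U \approx \sum_{i,j} a_{ij} y_i y_j - 2\theta \sum_{i,j} a_{ij} x_i y_j + \theta^2 \sum_{i,j} (a_{ij} x_i x_j - b_{ij} y_i y_j)$.    (14)

Here we have discarded terms containing $\theta$ to a degree
higher than the second.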


Minimizing $U$ and taking account of the symmetry of the
matrices, we get:

    $\hat{\theta} = \dfrac{\sum_{i,j} a_{ij} x_i y_j}{\sum_{i,j} (a_{ij} x_i x_j - b_{ij} y_i y_j)}$.    (15)
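The estimate can be sketched in a few lines of code. This sketch assumes that (15) is the ratio $\sum a_{ij} x_i y_j / \sum (a_{ij} x_i x_j - b_{ij} y_i y_j)$, which is what minimizing the quadratic form truncated at $\theta^2$ yields, with $\|a_{ij}\| = \Sigma_y^{-1}$ and $\|b_{ij}\| = \Sigma_y^{-1} \Sigma_x \Sigma_y^{-1}$. The function names and the Gauss-Jordan inversion are illustrative choices, not part of the original.

```python
def invert(m):
    """Gauss-Jordan inverse with partial pivoting (nested lists)."""
    n = len(m)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def matmul(p, q):
    """Product of two matrices given as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]


def quad(m, u, v):
    """Bilinear form sum_ij m[i][j] * u[i] * v[j]."""
    return sum(m[i][j] * u[i] * v[j]
               for i in range(len(u)) for j in range(len(v)))


def estimate_theta(x, y, sigma_y, sigma_x):
    """Simplified estimate of theta for the model y = theta * x."""
    a = invert(sigma_y)                   # ||a_ij|| = Sigma_y^-1
    b = matmul(matmul(a, sigma_x), a)     # ||b_ij|| = Sigma_y^-1 Sigma_x Sigma_y^-1
    denominator = quad(a, x, x) - quad(b, y, y)
    if denominator <= 0:
        raise ValueError("theta^2 * Sigma_x is not small relative to Sigma_y")
    return quad(a, x, y) / denominator
```

With $\Sigma_x = 0$ and $\Sigma_y = I$ the function reduces to the ordinary least-squares slope through the origin; a nonzero $\Sigma_x$ shrinks the denominator and therefore raises the estimate.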


Thus we have solved the first part of our problem: an
elementary estimate is found that allows for correlated errors
in the independent variables (as well as the dependent). But
for this we have assumed the covariational matrices $\Sigma_y$
and $\Sigma_x$ as given. In order to have guaranteed estimates
we shall use the following method (for the case of no errors in
the independent variables [...] and used by V. L. Gurin).

Let the covariational matrices $\Sigma_y$ and $\Sigma_x$ have
the form:

    [...]

where $\Sigma_y(\lambda)$ and $\Sigma_x(\lambda)$ satisfy, in
each of their elements, boundary conditions of the type:

    [...]

We have assumed that:

    [...]

are given in advance.

An analysis of (15) shows that, under our assumptions, we
can write:

    [...]    (18)

where the term $O(\cdot)$ includes the terms that vanish when
$\lambda \to 0$ and the terms containing $\xi$ beyond the first
power, as well as those that contain [...] and $\eta$. We also
note that the values figuring in (18) are not the measured, but
the actual values, while:

    $\theta x_i = y_i$.

Further, in identical manner we can write for the dispersion of
the estimate:

    [...]    (19)
To obtain estimates that are robust in regard to correlation we
must find the minimax value of [...], provided that the class
of covariational matrices is determined by expression (17) with
fixed value of $\lambda$, while the particular class of
algorithms includes linear algorithms that are unbiased in the
first approximation (i.e. when $\lambda \to 0$). In examining
the corresponding game with nature we observe that, when a
saddle point is present, we can replace the minimax with the
maximin. On the other hand, we can consider in a first
approximation that, for fixed covariational matrices, estimate
(15) has the least dispersion in the class of these particular
algorithms. We shall prove the existence of the saddle point by
a construction, i.e. by specifically indicating the appropriate
strategies. We denote:

    [...]    (20)

and shall take this as the strategy of nature. Our strategy is
obtained from expression (15) if $a_{ij}$ and $b_{ij}$
correspond to the covariational matrices (20). We denote the
corresponding estimate by [...]. The obtained pair of strategies
will be the saddle point when, and only when, the familiar
conditions are fulfilled:

    [...]    (21)
The inequality on the right of (21) follows from our previous
discussion. The left inequality is obtained as follows. As is
evident from formula (19), when $\lambda = 0$, the sign of the
partial derivative of the dispersion with respect to the
$(i, j)$ element of $\Sigma_x$ or $\Sigma_y$ agrees with the
sign of $x_i x_j$. Consequently, a certain neighborhood of the
point $\lambda = 0$ exists where this sign agreement is
maintained. In view of the approximations in our theory, i.e.
the assumption of small values of $\lambda$, we obtain the
requisite inequality from this fact and the definition (20).

Let us formulate the definitive results. If we know, when
solving problem (5)-(6), that the covariational matrices
$\Sigma_y$ and $\Sigma_x$ of the measurement errors satisfy the
conditions (16)-(17) and $\lambda$ is sufficiently small, we
can use the estimate (15) as the provisional effective estimate
$\hat{\theta}$, where $a_{ij}$ and $b_{ij}$ have been found from
(13) for matrices (20).

In the spirit of the provisional effective approach to
the evaluation, this result requires confirmation by
statistical modeling.

We make some further remarks.


1. The value $\theta$ figuring in (20) is previously unknown,
although it can be specified on the basis of a solution of
analogous problems by methods that do not allow for errors in
the independent variables. The precision with which $\theta$ is
given is not especially important in light of the fact that
$\lambda$ is also provisionally specified. We need only realize
that the smaller $\lambda$ is, the more accurate the result
will be. In regard to the actual values of $x_i$, $x_j$ also
figuring in (20), these can be replaced by the measured values,
as it is unlikely that this affects the sign of the product.


2. In forming the elements of the matrices $\Sigma_y$ and
$\Sigma_x$ we must check to be sure that the obtained matrices
are covariational; for $\Sigma_y$ this will be automatically
fulfilled when $\lambda$ is small, since a matrix which differs
little from a diagonal matrix with positive elements will be
positive definite. For $\Sigma_x$ we can use a similar argument
in constructing the boundary conditions.
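The check called for in remark 2 can be carried out with a Cholesky factorization, which succeeds exactly when a symmetric matrix is positive definite. The sketch below is only one way to perform such a check; the function name and the tolerance are assumptions, and the input is assumed to be symmetric.

```python
import math


def is_positive_definite(m, tol=1e-12):
    """Return True if the symmetric matrix m admits a Cholesky factor."""
    n = len(m)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = m[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                # A non-positive pivot means the matrix is not positive definite.
                if s <= tol:
                    return False
                low[i][j] = math.sqrt(s)
            else:
                low[i][j] = s / low[j][j]
    return True
```

A matrix close to a diagonal matrix with positive elements passes the check, in line with the argument of the remark, while a matrix with off-diagonal elements that are too large fails it.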


3. In the boundary conditions on the set of strategies of
nature (16)-(17) it is assumed that the quantities:

    [...]

as well as:

    [...]

are not connected by any sort of relation. If such relations
obtain, the form of the estimates will be slightly modified.

Let us consider several examples.

a) Usually the errors are in the form of a stationary
random sequence, i.e. there obtains:

    [...]

In this case the elements of the matrices [...] also depend
solely on $k = |i - j|$ and the largest or smallest possible
ones are chosen, depending on the sign of:

    $\psi(k) = \sum_{i=1}^{n-k} x_i x_{i+k}$.    (23)

b) The set of matrices can be narrowed down not only by
multiplying by a certain factor $\lambda$, but also in other
ways. For example, let the matrices be stationary (as in
example a) and:

    [...]

i.e. the boundary conditions form Markov matrices. It is then
more convenient to introduce the factor $\lambda$ that reduces
the set of matrices in the following fashion:

    [...]    (25)
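The sign rule of example a) can be illustrated numerically. The sketch assumes that the quantity whose sign is examined is the lag sum $\psi(k) = \sum_i x_i x_{i+k}$, that the points are equally spaced and centered on zero, and that a positive sign selects the upper bound of the covariance element while a non-positive sign selects the lower one. These are assumptions made for illustration; the exact formulas are not recoverable from the text, and the function names are hypothetical.

```python
def centered_grid(n, step):
    """n equally spaced points with zero arithmetic mean."""
    mid = (n - 1) / 2.0
    return [(i - mid) * step for i in range(n)]


def lag_products(x):
    """psi(k) = sum of x[i] * x[i + k] for each lag k = 0, ..., n - 1."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(n)]


def choose_bound(psi_k, lower, upper):
    """Pick the extreme covariance value according to the sign of psi(k)."""
    return upper if psi_k > 0 else lower
```

For four centered points with unit spacing, `lag_products` is positive for the lags 0 and 1 and negative for the lags 2 and 3, so the upper bounds would be taken near the main diagonal and the lower bounds far from it.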

3. Example

Let us have the relation:

    $y = \theta_0 + \theta x$,    (26)

and it is necessary to estimate $\theta_0$ and $\theta$. In
order to convert the problem to model (5) we shall consider
that $x$ is measured in an even number $n = 2l$ of points at
identical distances $\Delta$ (with precision down to the errors
in assignment of $x$). We denote:

    $\tilde{x} = x - x_{cp}$;  $\tilde{y} = y - y_{cp}$,    (27)

where $y_{cp}$ and $x_{cp}$ are the arithmetic means of the
measured values of $y$ and $x$. Then $\tilde{x}$ and
$\tilde{y}$ are related by (5).

We shall further employ the above-explained results for
the estimation of $\theta$. We first make two remarks.

1. In connection with the allowed correlation of measurement
errors and the presence of such errors in the measurement
of $x$, such a conversion of model (26) to model (5) is less
accurate than the simultaneous evaluation of $\theta_0$ and
$\theta$, which can be done in our conditions by using the
algorithm explained in the next section. But here we shall
disregard this and thus obtain an estimate for $\theta_0$
(after estimating $\theta$) in the form:

    $\hat{\theta}_0 = y_{cp} - \hat{\theta} x_{cp}$.    (28)
2. Whereas at the outset there was given a class of
allowable covariational matrices for $y$ and $x$, this class
will differ for the centered quantities $\tilde{y}$ and
$\tilde{x}$, and the conversion can be easily done. We shall
not concern ourselves with this here. We merely note that, even
for uncorrelated measurement errors of $y$ and $x$, the
measurement errors of the centered quantities $\tilde{y}$ and
$\tilde{x}$ will be correlated [sic!].
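The conversion of the relation with an intercept to the model without one can be sketched as follows. The sketch assumes that the conversion consists of subtracting the arithmetic means and that the intercept is then recovered as the mean of $y$ minus the slope estimate times the mean of $x$. The slope function used here is the plain no-error slope through the origin and stands in for estimate (15) only for illustration; all names are hypothetical.

```python
def center(values):
    """Subtract the arithmetic mean; return the centered list and the mean."""
    mean = sum(values) / len(values)
    return [v - mean for v in values], mean


def slope_through_origin(xc, yc):
    """Stand-in slope estimate that ignores errors in x (illustration only)."""
    return sum(a * b for a, b in zip(xc, yc)) / sum(a * a for a in xc)


def intercept(y_mean, x_mean, theta_hat):
    """Intercept recovered from the means and the slope estimate."""
    return y_mean - theta_hat * x_mean
```

For the exact data `x = [0, 1, 2, 3]`, `y = [1, 3, 5, 7]` the centered slope is 2 and the recovered intercept is 1.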

Thus, let us assume that we are already in the setting
discussed in the preceding section. To be specific, we assume
that the boundary conditions on the covariational matrices have
the form (25) with fixed value of $\lambda$. What is the
physical meaning of this? We are assuming an arbitrary
dispersion of the errors of $y$ with a sufficiently small
correlation; the smaller the correlation, the more accurate the
analysis. The correlation matrix that majorizes the modulus is
a Markov matrix, although in reality the errors may represent,
e.g., a stationary autoregressive process of any given order.
Since $\lambda$ has been chosen for satisfactory approximation
in the theory, we denote for brevity:

    [...]    (29)

The same can be said of $x$ and we denote:

    [...]    (30)


Here, however, we must add that the dispersion of the errors of
$x$ cannot be arbitrary, as follows from the difference between
formulas (25) for $y$ and $x$. For simplicity we shall consider
hereafter that, as stipulated above, the scale of $x$ is chosen
so that (25) is satisfied by virtue of $\theta^2$, whereas
$\sigma_x^2(1)$ and even $\sigma_x^2(\lambda)$ may be comparable
to $\sigma_y^2$.

Thus, to obtain guaranteed estimates we everywhere replace
$\sigma_y^2$ and $\sigma_x^2$ by their upper limits. In regard
to the nondiagonal elements of the covariational matrices
$\Sigma_y$ and $\Sigma_x$, these are taken as the largest or
smallest values, depending on the sign of $\psi(k)$ from
formula (23). We note that, as already mentioned above in
regard to (20), instead of $x_i$ we can insert in (23) their
real values or any others sufficiently close to them. From the
remarks at the outset of this section it follows that we can
take:

    [...]    (31)

Hence we obtain:

    [...]    (32)

i.e. $\psi(k) > 0$ when:

    $k <$ [...] $= c(l)$    (33)

and $\psi(k) < 0$ when $k > c(l)$. We can assume with sufficient
accuracy that:

    [...]

The obtained matrices:

    [...]

will have negative elements only far away from the main
diagonal. Since it follows from (25) that these elements will
be extremely small, the positive definiteness of the matrices
is assured. However, it is not exactly easy to invert them by
computer. We shall therefore take the following step.
Considering the small magnitude of these elements, we shall
replace their sign by a positive one, which little affects the
later results. But now the matrices become Markov and they can
be inverted analytically. To obtain $a_{ij}$ and $b_{ij}$ from
(13) and estimate (15), we have only to calculate the matrix
product:

    $\Sigma_y^{-1} \Sigma_x \Sigma_y^{-1}$,

the elements of which have an analytic expression. This can
also be done analytically (we shall not give here the
corresponding formulas), but even when the computer is used it
is much less time-consuming than a numerical matrix inversion.
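The analytic inversion mentioned above can be illustrated as follows. The sketch assumes that a "Markov matrix" here means a covariance matrix with elements $\sigma^2 \rho^{|i-j|}$ (the covariance of a first-order autoregressive sequence); the paper does not state its formulas, so this identification is an assumption. For such a matrix the inverse is known in closed form and is tridiagonal. The function names are hypothetical.

```python
def markov_matrix(n, variance, rho):
    """Covariance matrix with elements variance * rho**|i - j|."""
    return [[variance * rho ** abs(i - j) for j in range(n)] for i in range(n)]


def markov_inverse(n, variance, rho):
    """Closed-form tridiagonal inverse of markov_matrix(n, variance, rho)."""
    if n < 2 or not -1.0 < rho < 1.0 or variance <= 0.0:
        raise ValueError("need n >= 2, |rho| < 1 and variance > 0")
    scale = 1.0 / (variance * (1.0 - rho * rho))
    inverse = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # End points of the diagonal carry 1, interior points 1 + rho^2.
        inverse[i][i] = scale * (1.0 if i in (0, n - 1) else 1.0 + rho * rho)
        if i + 1 < n:
            inverse[i][i + 1] = inverse[i + 1][i] = -scale * rho
    return inverse


def matmul(p, q):
    """Product of two matrices given as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]
```

Because the inverse has only three nonzero diagonals, forming the product $\Sigma_y^{-1} \Sigma_x \Sigma_y^{-1}$ costs on the order of $n^2$ operations instead of the $n^3$ required by a general numerical inversion.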


4. Investigation of a More Complex Model 


Let us have, instead of (5), a model of type:

    $y = \theta' x$,    (35)

where:

    $\theta' = (\theta_1, \ldots, \theta_k)$;    (36)

    $x' = (x_1, \ldots, x_k)$.    (37)

For given covariational error matrices this version has also
been considered in [17] in nonsimplified form. In this case
the quadratic form (9) is much more complicated:

    [...]    (38)

where:

    [...]

and

    $\Sigma = \|\sigma_{ij}(\mu, \nu)\|$.    (39)
In (39) $i, j$ are the numbers of the measurements, while
$\mu, \nu$ are the numbers of the variables, i.e.
$i, j = 1, 2, \ldots, n$; $\mu, \nu = 1, 2, \ldots$

To simplify the problem, by analogy with the case $k = 1$
explained above, we shall take $\theta_1, \ldots, \theta_k$ as
small (which, as in the mentioned case, can be done by choice
of the scales when there are small errors in
$x_1, \ldots, x_k$). To simplify the later computations we
assume that the errors of the various variables are not
correlated with each other (this assumption is not essential;
dropping it merely complicates the formulas but introduces no
essential modifications). Then:

    $\sigma_{ij}(\mu, \nu) = 0$ when $\mu \neq \nu$,

and (38) becomes:

    [...]    (40)

i.e. instead of (11) we have:

    $\Sigma_z = \Sigma_y + \sum_{\mu=1}^{k} \theta_\mu^2 \Sigma_\mu$,    (41)

where $\Sigma_\mu$ denotes the covariational matrix of the
errors of the $\mu$-th independent variable. Instead of (12) we
get:

    $\Sigma_z^{-1} \approx \left(I - \sum_{\mu=1}^{k} \theta_\mu^2 \Sigma_y^{-1} \Sigma_\mu\right) \Sigma_y^{-1}$,    (42)

or:

    [...]    (43)

We then construct the quadratic form $U$, similar to (14),
and discard terms with $\theta_\mu$ above the second power. As
a result we get:

    [...]    (44)

Minimizing $U$ we obtain (in view of the symmetry of the
matrices) the set of equations:

    $B\theta = c$,    (45)

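where the matrix $B$ is determined as follows in terms of its
elements ($\delta_{\mu\nu} = 0$ when $\mu \neq \nu$;
$\delta_{\mu\mu} = 1$):

    $B_{\mu\nu} = x_\mu' \Sigma_y^{-1} x_\nu - \delta_{\mu\nu}\, y' [\Sigma_y^{-1} \Sigma_\mu \Sigma_y^{-1}] y$,    (46)

while the vector $c$ has components:

    $c_\mu = x_\mu' \Sigma_y^{-1} y$.    (47)

Here $x_\mu$ is the vector of the $n$ measured values of the
$\mu$-th independent variable and $\Sigma_\mu$ is the
covariational matrix of its errors. From this we obtain the
estimate:

    $\hat{\theta} = B^{-1} c$.    (48)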

Estimate (48), just as (15) (obtained from (48) in the
particular case $k = 1$), is simple; the matrix $B$, which must
be inverted, has order $k$, i.e. it presents no computational
difficulties. Moreover, in the entire calculation process it
is only necessary to invert the high-order matrix once.
But we have already seen in the case of $k = 1$ that this
usually can also be done analytically (especially when speaking
of guaranteed estimates).
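The computation described here can be sketched for several independent variables. The sketch assumes that the system has the form $B\theta = c$ with $B_{\mu\nu} = x_\mu' \Sigma_y^{-1} x_\nu - \delta_{\mu\nu}\, y' \Sigma_y^{-1} \Sigma_\mu \Sigma_y^{-1} y$ and $c_\mu = x_\mu' \Sigma_y^{-1} y$, where $\Sigma_\mu$ is the error covariance matrix of the $\mu$-th variable. This form follows from the same truncation as in the case $k = 1$ but is a reconstruction, and all names are illustrative.

```python
def invert(m):
    """Gauss-Jordan inverse with partial pivoting (nested lists)."""
    n = len(m)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def matmul(p, q):
    """Product of two matrices given as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]


def quad(m, u, v):
    """Bilinear form sum_ij m[i][j] * u[i] * v[j]."""
    return sum(m[i][j] * u[i] * v[j]
               for i in range(len(u)) for j in range(len(v)))


def estimate_theta_vector(columns, y, sigma_y, sigma_xs):
    """Solve B theta = c; columns[mu] holds the n values of variable mu."""
    a = invert(sigma_y)          # the only high-order inversion
    k = len(columns)
    b_mat = [[quad(a, columns[mu], columns[nu]) for nu in range(k)]
             for mu in range(k)]
    for mu in range(k):
        correction = matmul(matmul(a, sigma_xs[mu]), a)
        b_mat[mu][mu] -= quad(correction, y, y)
    c = [quad(a, columns[mu], y) for mu in range(k)]
    b_inv = invert(b_mat)        # order k, i.e. small
    return [sum(b_inv[mu][nu] * c[nu] for nu in range(k)) for mu in range(k)]
```

The matrix of order $n$ is inverted once, and the system that is actually solved has order $k$, which mirrors the remark that the estimate presents no computational difficulties.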


Moving on to construct provisionally effective guaranteed
estimates we note that in this case there will be one
significant difference from the case $k = 1$. The penalty
function in the game with nature should be scalar, while the
precision of the estimate $\hat{\theta}$ is expressed by its
covariational matrix. Therefore we shall examine not the
estimate $\hat{\theta}$, but the estimate of a certain linear
form $\beta'\theta$, e.g. the predicted value of $y$ for given
values of $x' = (x_1, \ldots, x_k)$. Without repeating in detail
the remarks made in the case $k = 1$, we merely give the analogs
of the respective formulas.


Denoting by $\tilde{X}$ the measurement result matrix of the
independent variables, instead of formula (18) that obtained in
the case $k = 1$ we obtain the following expression for the
form $\beta'\hat{\theta}$:

    [...]    (49)

Here:

    [...]

(i.e. this corresponds to the vector $\xi$ in (18)), while
$\delta X = \tilde{X} - X$ is the measurement error matrix for
the independent variables; the terms not written out can be
ignored when $\lambda$ and $\theta$ have correspondingly small
values.

The choice of the largest or smallest of the values of
$\sigma_{ij}(\nu)$ as candidate for [...] will now depend on
the sign of the product of the coefficients of
$\delta x_{i\nu}$ and $\delta x_{j\nu}$ in expression (49).
In order to find the coefficient of $\delta x_{i\nu}$ we may
proceed as follows. By [...] we denote the matrix that has a
solitary nonzero element equal to unity at the intersection of
the i-th row and the j-th column. Then the sought coefficient
[...] is equal to:

    [...]

This refers to the case of $\nu = 1, 2, \ldots, k$. For
$\nu = k + 1$, of course:

    [...]

where the column vector $e_i$ has unity in the i-th position,
while the other elements are zeroes. For each specific problem
the choice of [...] can be done by computer. The only operation
of matrix inversion will be that of the matrix $X'X$, the order
of which is $k$, i.e. small.

All the remarks made for the case of $k = 1$ also apply here.


5. Testing of Hypotheses

The findings can also be used to test hypotheses. Thus, in the
case of $k = 1$, we can obtain a provisionally effective robust
criterion to test the hypothesis of the absence of a trend in
the quantity $y$ as $x$ varies, by using expression (15). In
order to investigate the statistical characteristics of this
criterion we must assign boundary conditions on the
covariational error matrices for the measurements of $x$ and
$y$, find the coefficients $a_{ij}$ and $b_{ij}$, and then
obtain for $\theta = 0$ the distribution function (or its
quantiles) of the criterion $\hat{\theta}$ as determined by
formula (15). This is a rather elaborate task, but it is only
solved once and the results of its solution can be summarized
in tables.
In order to illustrate by the simplest example the difference
between the proposed criterion and the existing one (which does
not allow for errors in the measurement of the independent
variable), let us assume that the errors are not correlated
and:

    $\sigma_x^2 \le \kappa \sigma_y^2$.

When $\theta$ is small, the entire theory is accurate (and
$\kappa$ need not be small in this case!) and we adopt the
guaranteed value:

    $\sigma_x^2 = \kappa \sigma_y^2$.

Then the criterion $\hat{\theta}$ becomes:

    $\hat{\theta} = \dfrac{\sum_i x_i y_i}{\sum_i x_i^2 - \kappa \sum_i y_i^2}$

(we recall that the measurements of $x_i$ and $y_i$ are not
centered). As $\kappa$ increases, so does $\hat{\theta}$. Thus,
if $x$ is indeed measured with errors, the use of a criterion
that does not take this into account increases the likelihood
of an error of the second kind (we assume that $\theta = 0$ is
the null hypothesis).
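The dependence of the criterion on the assumed error ratio can be illustrated numerically. The sketch assumes uncorrelated errors with the variance of the errors in $x$ equal to a fixed multiple (here called `kappa`, a hypothetical name) of the variance of the errors in $y$, in which case estimate (15) reduces to $\sum x_i y_i / (\sum x_i^2 - \kappa \sum y_i^2)$. This reduced form is a reconstruction and the data are invented purely for illustration.

```python
def trend_statistic(x, y, kappa):
    """Criterion value for uncorrelated errors with var(x) = kappa * var(y)."""
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    syy = sum(b * b for b in y)
    denominator = sxx - kappa * syy
    if denominator <= 0:
        raise ValueError("kappa is too large for the small-theta approximation")
    return sxy / denominator


# Invented, uncentered measurements with a weak positive trend.
x_values = [1.0, 2.0, 3.0, 4.0]
y_values = [0.1, 0.3, 0.2, 0.5]
statistics = [trend_statistic(x_values, y_values, kappa)
              for kappa in (0.0, 1.0, 2.0)]
```

The three values in `statistics` increase with `kappa`; the value for `kappa = 0` is the criterion that ignores errors in $x$, and it is the smallest of the three.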